How Did Covid Affect Seattle Housing Prices?

Introduction

After going to school in Seattle (University of Washington) for 4 years, I found Seattle to have a very unique personality. It has a balanced blend of metropolitan life and nature; it is surrounded by the ocean and mountains; and it has traditions, while its people are also extremely forward thinking. It is a charming place, and here are some of the pictures that I took while I was there.

[Photos: Thompson Hall in March; Suzzallo Library on a Winter Afternoon; Evening at Montlake]

At the same time, Seattle is ground zero for Covid in the US, as the first confirmed U.S. Covid case was reported in the Seattle area. On the individual scale, everyone’s life has changed since then. On the scale of the world, the pandemic slowed down the global economy for years. Housing prices have always been somewhat of an indicator of the state of the economy, since the desire to buy a house is relatable to the vast majority of people. Therefore, the aim of this project is to forecast housing prices as if Covid had not happened and then study how Covid affected housing prices in Seattle.

Section 1: Initial Analysis

The initial analysis is done in this first section. The raw time series and its histogram are plotted, along with the time series differenced at lag 1. Then a series of ACF and PACF plots are produced for more context.

After differencing at lag 1 to remove the trend, the time series plot shows a significant drop around k = 300. This is most likely the result of the Covid-19 pandemic.

Therefore, the last 50 data points are left out when building the model. From the rest of the data set, the last 12 data points are used as a validation set.
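The split described above can be sketched as follows. This is a minimal Python illustration rather than the code used in the project; the function name split_series and the placeholder series are assumptions made for the example.

```python
def split_series(series, n_holdout=50, n_validation=12):
    """Split a series into training, validation, and held-out (Covid-era) parts."""
    if len(series) <= n_holdout + n_validation:
        raise ValueError("series is too short for the requested split")
    pre_covid = series[:-n_holdout]          # drop the last 50 points
    train = pre_covid[:-n_validation]        # used to fit the model
    validation = pre_covid[-n_validation:]   # last 12 pre-Covid points
    holdout = series[-n_holdout:]            # compared against forecasts later
    return train, validation, holdout


# Placeholder data: 320 values standing in for the real monthly price series.
example = list(range(320))
train, validation, holdout = split_series(example)
print(len(train), len(validation), len(holdout))  # 258 12 50
```

Keeping the held-out block separate makes it possible to compare the Covid-era observations with what the model would have predicted.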

Section 1.1: Transformation

In this section, a Box-Cox plot is generated to check whether any transformation is necessary:

Log transformation gave a more symmetric histogram.
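One way to check the symmetry claim numerically is to compare the sample skewness before and after the transformation. The sketch below is only an illustration under stated assumptions: the helper names box_cox and skewness are hypothetical, and the data are synthetic rather than the Zillow series. It relies on the fact that the Box-Cox transform with lambda = 0 is the natural log.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform; lam = 0 corresponds to the natural log."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic right-skewed data: exponentials of a symmetric grid.
raw = [math.exp(0.1 * k) for k in range(-20, 21)]
logged = box_cox(raw, 0)
print(skewness(raw), skewness(logged))
```

A skewness close to zero after the transform corresponds to the more symmetric histogram.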

Section 1.2: Remove Trend/Seasonality

TS Differenced at Different Lags                             TS Variance
TS Difference at Lag 1                                       0.0001118
TS Difference at Lag 1 Twice                                 0.0000284
TS Difference at Lag 1 Twice, Then at Lag 12                 0.0000281
TS Difference at Lag 1 Twice, Then at Lag 12, and at Lag 1   0.0000412

From the variance table above, differencing at lag 1 twice and at lag 12 once produces the lowest variance. Further differencing actually increases the variance. Therefore, the project will proceed with differencing at lag 1 twice and at lag 12 once.
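The comparison of variances can be reproduced with a few lines of code. The following is a sketch on a synthetic series (a quadratic trend plus a period-12 pattern), not the housing data; the helper names difference and variance are assumptions made for the example.

```python
def difference(series, lag=1):
    """Return the lag-differenced series x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def variance(values):
    """Sample variance with the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


# Synthetic series: quadratic trend plus a repeating 12-month pattern.
season = [0.0, 0.3, 0.5, 0.4, 0.1, -0.2, -0.4, -0.5, -0.3, 0.0, 0.2, 0.1]
series = [0.01 * t * t + season[t % 12] for t in range(120)]

d1 = difference(series, 1)
d1_twice = difference(d1, 1)
d1_twice_d12 = difference(d1_twice, 12)

for label, x in [("lag 1", d1), ("lag 1 twice", d1_twice),
                 ("lag 1 twice, then lag 12", d1_twice_d12)]:
    print(label, variance(x))
```

On this synthetic series each step lowers the variance, because two lag-1 differences remove the quadratic trend and the lag-12 difference removes the seasonal pattern. On real data the variance eventually rises again once the series is overdifferenced, which is the stopping rule used above.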

Section 2: Model Identification/Selection

Section 2.1: Identify/Fit Possible Model #1

Upon examining the ACF, the values fall outside the confidence interval at lags 1 and 12. Since the seasonal period is 12, this could indicate q = 1 and Q = 1. The PACF demonstrates an exponential decay pattern between the years as well as within the year, which is consistent with a pure moving-average structure (p = 0 and P = 0). Therefore, the first candidate model is SARIMA(0,2,1)(0,1,1) with s = 12.
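The ACF reading above can be automated: compute the sample autocorrelations and flag the lags that fall outside the approximate 95% white-noise band of plus or minus 1.96/sqrt(n). The sketch below uses a simulated MA(1)-type series, not the differenced housing data, and the function names are hypothetical.

```python
import math
import random


def sample_acf(values, max_lag):
    """Sample autocorrelations rho(0..max_lag) with the usual 1/n estimator."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    gamma0 = sum(c * c for c in centered) / n
    acf = []
    for k in range(max_lag + 1):
        gamma_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / n
        acf.append(gamma_k / gamma0)
    return acf


def significant_lags(values, max_lag):
    """Lags whose sample ACF lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(values))
    acf = sample_acf(values, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


# Simulated example: x[t] = z[t] - 0.8 z[t-1], which has a strong lag-1 ACF.
rng = random.Random(0)
shocks = [rng.gauss(0, 1) for _ in range(301)]
ma1_series = [shocks[t] - 0.8 * shocks[t - 1] for t in range(1, 301)]
print(significant_lags(ma1_series, 24))
```

A spike at lag 1 that cuts off afterwards is the pattern that suggests q = 1; a spike at lag 12 plays the same role for Q = 1.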

Fit 1 Coefficients
        Coef         S.E
ma1     -0.3559014   0.0761016
sma1    -0.8811392   0.0544728

Upon examining the CI for both coefficients, 0 is not contained in either interval. Therefore, both coefficients are significant.
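The intervals can be reproduced from the table with the usual normal approximation, estimate plus or minus 1.96 standard errors. The 1.96 multiplier and the function name wald_interval are assumptions of this sketch; the numbers are copied from the Fit 1 table.

```python
def wald_interval(coef, se, z=1.96):
    """Approximate 95% interval: estimate plus or minus z standard errors."""
    return coef - z * se, coef + z * se


# Estimates and standard errors copied from the Fit 1 table.
fit1 = {"ma1": (-0.3559014, 0.0761016), "sma1": (-0.8811392, 0.0544728)}

for name, (coef, se) in fit1.items():
    low, high = wald_interval(coef, se)
    contains_zero = low <= 0.0 <= high
    print(f"{name}: ({low:.4f}, {high:.4f}) contains 0: {contains_zero}")
```

Both upper bounds are below zero, which matches the conclusion that the two coefficients are significant.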

Section 2.2: Diagnostic Checking of Model #1

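Diagnostic Tests for Fit 1
Test            P-Value
Shapiro-Wilk    0.0000000
Box-Pierce      0.3393664
Ljung-Box       0.3116902
McLeod-Li       0.6204202

The Box-Pierce, Ljung-Box, and McLeod-Li p-values are all larger than 0.05, so the residuals show no significant remaining correlation. The Shapiro-Wilk test, however, rejects normality of the residuals.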
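For reference, the two portmanteau statistics behind the Box-Pierce and Ljung-Box rows can be computed directly from the residual autocorrelations. The sketch below only computes the statistics; turning them into p-values requires a chi-square distribution with the number of lags minus the number of fitted coefficients as degrees of freedom, which is left to a statistics library. The function names and the simulated residuals are assumptions of the example.

```python
import random


def autocorr(values, k):
    """Sample autocorrelation at lag k."""
    n = len(values)
    mean = sum(values) / n
    c = [v - mean for v in values]
    return sum(c[t] * c[t - k] for t in range(k, n)) / sum(x * x for x in c)


def portmanteau_stats(residuals, h):
    """Return (Box-Pierce Q, Ljung-Box Q) based on lags 1..h."""
    n = len(residuals)
    rho = [autocorr(residuals, k) for k in range(1, h + 1)]
    box_pierce = n * sum(r * r for r in rho)
    ljung_box = n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(rho, start=1)
    )
    return box_pierce, ljung_box


# Simulated white-noise residuals standing in for the model residuals.
rng = random.Random(1)
noise = [rng.gauss(0, 1) for _ in range(200)]
bp, lb = portmanteau_stats(noise, h=12)
print(bp, lb)
```

The McLeod-Li test applies the same Ljung-Box statistic to the squared residuals, so portmanteau_stats([r * r for r in noise], 12) gives the corresponding statistic.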

Section 2.3: Identify/Fit Possible Model #2

Upon examining the ACF, the values fall outside the confidence interval at lags 1 and 12. Since the seasonal period is 12, this could indicate q = 1 and Q = 1. Upon examining the PACF, the seasonal pattern can alternatively be interpreted as P = 2 and, within the year, p = 1. Therefore, the second model is SARIMA(1,2,1)(2,1,1) with s = 12.

Fit 2 Coefficients
        Coef         S.E
ar1     -0.0526922   0.0696598
ma1     -0.3054904   0.4000291
sar1    -0.1284971   0.0741804
sar2    -0.1930561   0.0502129
sma1    -0.7730429   0.0778808

From the coefficient estimates and their standard errors, one can construct a CI for each coefficient. The CI for ar1 contains 0; therefore, the model is adjusted to SARIMA(0,2,1)(2,1,1) with s = 12.

        Coef         S.E
ma1     -0.3544822   0.0763583
sar1    -0.1254902   0.1016225
sar2    -0.1907180   0.0905919
sma1    -0.7736512   0.1013464

Constructing the CIs again, the CI for sar1 contains 0; therefore, the model is adjusted to SARIMA(0,2,1)(2,1,1) with s = 12 and the coefficient for sar1 fixed at 0.

        Coef         S.E
ma1     -0.3563616   0.0763440
sar1     0.0000000   0.0000000
sar2    -0.1398785   0.0794084
sma1    -0.8440430   0.0642367

Constructing the CIs once more, the CI for sar2 contains 0, so sar2 is dropped as well. With both seasonal AR terms removed, the model becomes SARIMA(0,2,1)(0,1,1) with s = 12, which is exactly the first fit.

Therefore, this project will proceed with the first fit.

Section 3: Forecast

The forecast is implemented in this section.

The first plot shows the forecast on the original data:

The second plot adds the actual time series data (in black):

The prediction is made for January 1, 2019. The confidence interval is very narrow, and the actual time series touches its lower bound. This could indicate a heavy-tailed distribution; the residual tests also suggest a heavy-tailed distribution.

Section 4: Examine How Covid Changed Seattle’s Housing Prices

The more advanced goal of this project is to study how Covid changed Seattle’s housing market. A 48-step-ahead prediction puts the forecast right in the middle of the pandemic.
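To make the multi-step forecast concrete, the sketch below rolls a SARIMA(0,2,1)(0,1,1) model with s = 12 forward by hand. It is a simplified conditional recursion that sets pre-sample innovations to zero and returns point forecasts only (no intervals), so it is not the routine used to produce the plots. The function name and the synthetic series are assumptions; the coefficients are taken from the Fit 1 table and are assumed to follow the (1 + theta B) sign convention. If the model was fit to log prices, the output is on the log scale and would need to be exponentiated.

```python
def forecast_sarima_021_011(y, theta, Theta, steps, s=12):
    """Point forecasts for (1-B)^2 (1-B^s) y_t = (1 + theta B)(1 + Theta B^s) z_t.

    Conditional recursion: innovations before the first usable time point
    are set to zero, which is an approximation.
    """
    if s < 3 or len(y) <= s + 2:
        raise ValueError("need s >= 3 and more than s + 2 observations")
    # Expanding (1-B)^2 (1-B^s) gives
    # y_t = 2 y_{t-1} - y_{t-2} + y_{t-s} - 2 y_{t-s-1} + y_{t-s-2} + w_t.
    lag_coefs = {1: 2.0, 2: -1.0, s: 1.0, s + 1: -2.0, s + 2: 1.0}

    def from_past(path, t):
        return sum(c * path[t - lag] for lag, c in lag_coefs.items())

    def ma_terms(z, t):
        return theta * z[t - 1] + Theta * z[t - s] + theta * Theta * z[t - s - 1]

    n = len(y)
    z = [0.0] * (n + steps)            # future innovations stay at zero
    for t in range(s + 2, n):          # recover in-sample innovations
        z[t] = y[t] - from_past(y, t) - ma_terms(z, t)

    path = list(y)
    for t in range(n, n + steps):      # roll the recursion forward
        path.append(from_past(path, t) + ma_terms(z, t))
    return path[n:]


# Synthetic series: quadratic trend plus a period-12 pattern (not the real data).
season = [0.0, 0.3, 0.5, 0.4, 0.1, -0.2, -0.4, -0.5, -0.3, 0.0, 0.2, 0.1]
truth = [0.001 * t * t + season[t % 12] for t in range(120)]
forecast = forecast_sarima_021_011(
    truth[:72], theta=-0.3559014, Theta=-0.8811392, steps=48
)
print(len(forecast))  # 48
```

Because the synthetic series is removed exactly by the chosen differencing, the 48 forecasts continue its trend and seasonal pattern. With real data, the gap between such a forecast and the observed prices is what this section reads as the effect of Covid.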

Section 5: Spectral Analysis

The p-value of the Fisher test is 0.6488167, which is larger than 0.05. We therefore fail to reject the null hypothesis that the residuals are Gaussian white noise.

The residuals also pass the Kolmogorov-Smirnov test.
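Fisher's test compares the largest periodogram ordinate with the sum of all ordinates; a large ratio points to a hidden periodic component. The sketch below implements the statistic and its exact p-value under Gaussian white noise with a plain DFT. It is an illustration on simulated data, and the function names are hypothetical.

```python
import cmath
import math
import random


def periodogram(x):
    """Periodogram at Fourier frequencies k = 1..m, m = floor((n - 1) / 2)."""
    n = len(x)
    mean = sum(x) / n
    m = (n - 1) // 2
    out = []
    for k in range(1, m + 1):
        dft = sum(
            (x[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        out.append(abs(dft) ** 2 / n)
    return out


def fisher_g_test(x):
    """Return (g, p_value) for Fisher's test of a hidden periodicity."""
    spec = periodogram(x)
    m = len(spec)
    g = max(spec) / sum(spec)
    # Exact null distribution under Gaussian white noise:
    # P(G > g) = sum_{j=1}^{floor(1/g)} (-1)^(j-1) C(m, j) (1 - j g)^(m-1)
    upper = min(m, int(1.0 / g))
    p = sum(
        (-1) ** (j - 1) * math.comb(m, j) * (1.0 - j * g) ** (m - 1)
        for j in range(1, upper + 1)
    )
    return g, min(1.0, max(0.0, p))


rng = random.Random(0)
white = [rng.gauss(0, 1) for _ in range(96)]
periodic = [math.sin(2 * math.pi * 8 * t / 96) + 0.1 * white[t] for t in range(96)]

print(fisher_g_test(white))
print(fisher_g_test(periodic))
```

For the series with an embedded sinusoid the p-value is essentially zero, while a p-value above 0.05, as reported for the model residuals, gives no evidence of a leftover periodic component.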

Section 6: Conclusion

A heavy-tailed distribution would work better for this data than a Gaussian one.

Covid hurt the housing market in Seattle.

Section 7: References

All the pictures in this project were taken by me.

This data set was obtained from Zillow’s website.

The lecture slides helped to build this project: 274 Lecture Slides

A huge thank you to Prof. Feldman for teaching and helping me with this project!

Appendix